Mr-SBC: A Multi-relational Naïve Bayes Classifier
Abstract
In this paper we propose an extension of the naïve Bayes classification method to the multi-relational setting. In this setting, training data are stored in several tables related by foreign key constraints, and each example is represented by a set of related tuples rather than by a single row, as in the classical data mining setting. This work is characterized by three aspects. First, an integrated approach to the computation of the posterior probabilities for each class that makes use of first-order classification rules. Second, the applicability to both discrete and continuous attributes by means of supervised discretization. Third, the consideration of knowledge on the data model embedded in the database schema during the generation of classification rules. The proposed method has been implemented in the new system Mr-SBC, which is tightly integrated with a relational DBMS. Testing has been performed on two datasets and four benchmark tasks. Results on predictive accuracy and efficiency are in favour of Mr-SBC for the most complex tasks.
Similar resources
Multi-relational Structural Bayesian Classifier
In the traditional naïve Bayes classification method, training data are represented as a single table (or database relation), where each row corresponds to an example and each column to a predictor variable or a target variable. In this paper we propose a multi-relational extension of the naïve Bayes classification method that is characterized by three aspects: first, an integrated approach i...
Feature Selection for the Naive Bayesian Classifier Using Decision Trees
It is known that the Naïve Bayesian classifier (NB) works very well on some domains and poorly on others. The performance of NB suffers in domains that involve correlated features. C4.5 decision trees, on the other hand, typically perform better than the Naïve Bayesian algorithm on such domains. This paper describes a Selective Bayesian classifier (SBC) that simply uses only those features that C4....
An Efficient Multi-relational Naïve Bayesian Classifier Based on Semantic Relationship Graph
Classification is one of the most popular data mining tasks with a wide range of applications, and lots of algorithms have been proposed to build accurate and scalable classifiers. Most of these algorithms only take a single table as input, whereas in the real world most data are stored in multiple tables and managed by relational database systems. As transferring data from multiple tables into...
Efficient Heterogeneous Multi-relational Classification Using Multi-criteria Ranking Approach Based on Characteristics of Multiple Relations
Traditional data mining algorithms do not work efficiently for most real-world applications, where the data is stored in relational format. Even well-known traditional classification techniques such as J48 and Naïve Bayes often suffer from poor scalability and unsatisfactory predictive performance when working with relational data. Moreover, the performance of existing relationa...
Simple Estimators for Relational Bayesian Classifiers
In this paper we present the Relational Bayesian Classifier (RBC), a modification of the Simple Bayesian Classifier (SBC) for relational data. There exist several Bayesian classifiers that learn predictive models of relational data, but each uses a different estimation technique for modeling heterogeneous sets of attribute values. The effects of data characteristics on estimation have not been ...
Publication date: 2003